Improved U-Net for seal segmentation of Republican archives
You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI
Journal of Computer Applications    2023, 43 (3): 943-948.   DOI: 10.11772/j.issn.1001-9081.2022020218

Precise seal segmentation benefits intelligent applications of Republican archives. To address the severe seal-print intrusion and heavy noise in these documents, a seal segmentation network named U-Net for Seal (UNet-S) was proposed. Building on the encoder-decoder framework and skip connections of U-Net, the network was improved in three ways. Firstly, a multi-scale residual module replaced the original convolution layers of U-Net, so that problems such as network degradation and gradient explosion were avoided while multi-scale features were extracted effectively. Secondly, Depthwise Separable Convolution (DSConv) replaced the ordinary convolution in the multi-scale residual module, greatly reducing the number of network parameters. Thirdly, Binary Cross Entropy Dice Loss (BCEDiceLoss) was adopted, with its weight factor determined experimentally, to address the data imbalance of the Republican archives. Experimental results show that, compared with U-Net, DeepLab v2 and other networks, UNet-S achieves the best Dice Similarity Coefficient (DSC), mean Intersection over Union (mIoU) and Mean Pixel Accuracy (MPA), with improvements of up to 17.38%, 32.68% and 0.6% respectively, while the number of parameters is reduced by up to 76.64%. It can be seen that UNet-S achieves good segmentation performance on the Republican archives dataset.
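To make the two named building blocks concrete, below is a minimal PyTorch sketch of a depthwise separable convolution, a multi-scale residual block built from it, and a weighted BCE + Dice loss. The kernel sizes, branch count, and the weight factor alpha are illustrative assumptions, not the authors' exact settings:

import torch
import torch.nn as nn

class DSConv(nn.Module):
    """Depthwise separable convolution: depthwise conv followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MultiScaleResidualBlock(nn.Module):
    """Parallel DSConv branches at different kernel sizes, fused, plus a residual shortcut."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branch3 = DSConv(in_ch, out_ch, 3)
        self.branch5 = DSConv(in_ch, out_ch, 5)
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, 1)   # merge the two scales
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)    # match channels for the residual add
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        multi = self.fuse(torch.cat([self.branch3(x), self.branch5(x)], dim=1))
        return self.act(multi + self.shortcut(x))

class BCEDiceLoss(nn.Module):
    """Weighted sum of binary cross-entropy and Dice loss for imbalanced masks."""
    def __init__(self, alpha=0.5, eps=1e-6):
        super().__init__()
        self.alpha = alpha   # weight factor; the paper determines it experimentally
        self.eps = eps
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, logits, target):
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum()
        dice = 1 - (2 * inter + self.eps) / (prob.sum() + target.sum() + self.eps)
        return self.alpha * self.bce(logits, target) + (1 - self.alpha) * dice

The Dice term directly rewards overlap between the predicted and ground-truth seal regions, which is why it helps when seal pixels are a small fraction of each page.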

Image caption generation model with adaptive commonsense gate
You YANG, Lizhi CHEN, Xiaolong FANG, Longyue PAN
Journal of Computer Applications    2022, 42 (12): 3900-3905.   DOI: 10.11772/j.issn.1001-9081.2021101743

To address the issues that traditional image caption models cannot make full use of image information and rely on a single method of fusing features, an image caption generation model with an Adaptive Commonsense Gate (ACG) was proposed. Firstly, VC R-CNN (Visual Commonsense Region-based Convolutional Neural Network) was used to extract visual commonsense features, which were fed into the Transformer encoder as a commonsense feature layer. Then, an ACG was designed in each encoder layer to perform adaptive fusion of the visual commonsense features and the encoding features. Finally, the encoding features fused with commonsense information were fed into the Transformer decoder to complete training. Training and testing were carried out on the MSCOCO dataset. The results show that the proposed model reaches 39.2, 129.6 and 22.7 on the BLEU (BiLingual Evaluation Understudy)-4, CIDEr (Consensus-based Image Description Evaluation) and SPICE (Semantic Propositional Image Caption Evaluation) metrics respectively, improvements of 3.2%, 2.9% and 2.3% over the POS-SCAN (Part-Of-Speech Stacked Cross Attention Network) model. It can be seen that the proposed model significantly outperforms Transformer models that use only single salient-region features and can describe image content accurately.
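As a rough illustration of the gated-fusion idea, here is a minimal PyTorch sketch of one possible Adaptive Commonsense Gate: a learned sigmoid gate that decides, per dimension, how much visual commonsense feature to blend into the encoding feature at each encoder layer. The concat-then-project gating and tensor shapes are assumptions for illustration, not the authors' exact design:

import torch
import torch.nn as nn

class AdaptiveCommonsenseGate(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        # gate computed from the concatenation of both feature streams
        self.gate_proj = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_feat, cs_feat):
        # enc_feat: encoding features, cs_feat: visual commonsense features,
        # both of shape (batch, num_regions, d_model)
        g = torch.sigmoid(self.gate_proj(torch.cat([enc_feat, cs_feat], dim=-1)))
        # adaptive blend: g keeps the encoding feature, (1 - g) admits commonsense
        return g * enc_feat + (1 - g) * cs_feat

Inserting such a gate into every encoder layer lets the model weight the commonsense stream differently per layer and per region, rather than committing to one fixed fusion scheme.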
